Main

=============== <Original Dataset> ===============
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20640 entries, 0 to 20639
Data columns (total 10 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   longitude           20640 non-null  float64
 1   latitude            20640 non-null  float64
 2   housing_median_age  20640 non-null  float64
 3   total_rooms         20640 non-null  float64
 4   total_bedrooms      20433 non-null  float64
 5   population          20640 non-null  float64
 6   households          20640 non-null  float64
 7   median_income       20640 non-null  float64
 8   median_house_value  20640 non-null  float64
 9   ocean_proximity     20640 non-null  object 
dtypes: float64(9), object(1)
memory usage: 1.6+ MB
None

       longitude  latitude  housing_median_age  total_rooms  total_bedrooms  population  households  median_income  median_house_value  ocean_proximity
0        -122.23     37.88                41.0        880.0           129.0       322.0       126.0         8.3252            452600.0         NEAR BAY
1        -122.22     37.86                21.0       7099.0          1106.0      2401.0      1138.0         8.3014            358500.0         NEAR BAY
2        -122.24     37.85                52.0       1467.0           190.0       496.0       177.0         7.2574            352100.0         NEAR BAY
3        -122.25     37.85                52.0       1274.0           235.0       558.0       219.0         5.6431            341300.0         NEAR BAY
4        -122.25     37.85                52.0       1627.0           280.0       565.0       259.0         3.8462            342200.0         NEAR BAY
...          ...       ...                 ...          ...             ...         ...         ...            ...                 ...              ...
20635    -121.09     39.48                25.0       1665.0           374.0       845.0       330.0         1.5603             78100.0           INLAND
20636    -121.21     39.49                18.0        697.0           150.0       356.0       114.0         2.5568             77100.0           INLAND
20637    -121.22     39.43                17.0       2254.0           485.0      1007.0       433.0         1.7000             92300.0           INLAND
20638    -121.32     39.43                18.0       1860.0           409.0       741.0       349.0         1.8672             84700.0           INLAND
20639    -121.24     39.37                16.0       2785.0           616.0      1387.0       530.0         2.3886             89400.0           INLAND

20640 rows × 10 columns

=============== <Modified Dataset> ===============
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20433 entries, 0 to 20432
Data columns (total 9 columns):
 #   Column              Non-Null Count  Dtype  
---  ------              --------------  -----  
 0   longitude           20433 non-null  float64
 1   latitude            20433 non-null  float64
 2   housing_median_age  20433 non-null  float64
 3   total_rooms         20433 non-null  float64
 4   total_bedrooms      20433 non-null  float64
 5   population          20433 non-null  float64
 6   households          20433 non-null  float64
 7   median_income       20433 non-null  float64
 8   ocean_proximity     20433 non-null  object 
dtypes: float64(8), object(1)
memory usage: 1.4+ MB
None

       longitude  latitude  housing_median_age  total_rooms  total_bedrooms  population  households  median_income  ocean_proximity
0        -122.23     37.88                41.0        880.0           129.0       322.0       126.0         8.3252         NEAR BAY
1        -122.22     37.86                21.0       7099.0          1106.0      2401.0      1138.0         8.3014         NEAR BAY
2        -122.24     37.85                52.0       1467.0           190.0       496.0       177.0         7.2574         NEAR BAY
3        -122.25     37.85                52.0       1274.0           235.0       558.0       219.0         5.6431         NEAR BAY
4        -122.25     37.85                52.0       1627.0           280.0       565.0       259.0         3.8462         NEAR BAY
...          ...       ...                 ...          ...             ...         ...         ...            ...              ...
20428    -121.09     39.48                25.0       1665.0           374.0       845.0       330.0         1.5603           INLAND
20429    -121.21     39.49                18.0        697.0           150.0       356.0       114.0         2.5568           INLAND
20430    -121.22     39.43                17.0       2254.0           485.0      1007.0       433.0         1.7000           INLAND
20431    -121.32     39.43                18.0       1860.0           409.0       741.0       349.0         1.8672           INLAND
20432    -121.24     39.37                16.0       2785.0           616.0      1387.0       530.0         2.3886           INLAND

20433 rows × 9 columns
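
The two summaries above differ in two ways: 207 rows are gone (20640 - 20433, exactly the number of missing `total_bedrooms` values), and `median_house_value` is no longer a column. The code that produced the modified frame is not shown, so the following is only a minimal sketch of one way to get the same shape of result. The helper name `make_modified` and the toy frame are hypothetical, and the index reset is an assumption inferred from the new `RangeIndex`.

```python
import numpy as np
import pandas as pd


def make_modified(df: pd.DataFrame, target: str = "median_house_value") -> pd.DataFrame:
    """Drop rows with missing values, drop the target column, renumber rows from 0."""
    cleaned = df.dropna(axis=0)               # removes rows such as those lacking total_bedrooms
    cleaned = cleaned.drop(columns=[target])  # the target is set aside for the later comparison
    return cleaned.reset_index(drop=True)     # assumption: yields a fresh RangeIndex


# Tiny stand-in frame (not the real dataset) with one missing total_bedrooms value.
toy = pd.DataFrame({
    "total_rooms": [880.0, 7099.0, 1467.0, 1274.0],
    "total_bedrooms": [129.0, np.nan, 190.0, 235.0],
    "median_house_value": [452600.0, 358500.0, 352100.0, 341300.0],
    "ocean_proximity": ["NEAR BAY", "NEAR BAY", "NEAR BAY", "INLAND"],
})
modified = make_modified(toy)
print(modified.shape)
```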

=============== AutoML Start ===============
=============== Model : GMM ===============
Start calculating silhouette_score...( method = GMM )
best K_s = [2, 3]
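
The log reports only that silhouette scores were computed with `method = GMM` and that the best values were `[2, 3]`. As a hedged illustration of how such a selection could work, the sketch below fits a `GaussianMixture` for each candidate `k`, scores the resulting labels with `silhouette_score`, and keeps the two highest-scoring values. The function name `best_ks`, the candidate range, the `top_n=2` cut-off and the synthetic data are all assumptions, not taken from the original run.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture


def best_ks(X, candidates=range(2, 6), top_n=2, seed=0):
    """Return the top_n component counts ranked by silhouette score, plus all scores."""
    scores = {}
    for k in candidates:
        labels = GaussianMixture(n_components=k, random_state=seed).fit_predict(X)
        # silhouette_score needs at least two distinct labels
        if len(np.unique(labels)) < 2:
            continue
        scores[k] = silhouette_score(X, labels)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sorted(ranked[:top_n]), scores


# Synthetic stand-in data: three well-separated blobs (not the housing data).
X, _ = make_blobs(n_samples=300, centers=[[0, 0], [10, 10], [-10, 10]],
                  cluster_std=0.5, random_state=0)
ks, scores = best_ks(X)
print("best K_s =", ks)
```
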
covariance_type = full / init_params = kmeans / k = 2 Done.
========== Compare with original labels ==========
===count===
predict
0.0    16735
1.0     3698
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    179500.0
1.0    181800.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    22500.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    206703.069913
1.0    210049.914548
Name: median_house_value, dtype: float64
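
Each "Compare with original labels" block has the shape of a pandas group-by: the held-out `median_house_value` series is grouped by the predicted cluster id (`predict`) and summarised with count, max, median, min and mean. The sketch below reproduces that layout on toy numbers. The helper name `compare_with_labels` is hypothetical, and the original code may compute the statistics differently.

```python
import pandas as pd


def compare_with_labels(target: pd.Series, predicted) -> dict:
    """Group the original target by predicted cluster id and summarise each group."""
    keys = pd.Series(predicted, index=target.index, name="predict")
    grouped = target.groupby(keys)
    return {name: grouped.agg(name) for name in ("count", "max", "median", "min", "mean")}


# Toy values only; cluster ids are floats to mirror the 0.0 / 1.0 labels in the log.
target = pd.Series([100.0, 200.0, 300.0, 400.0], name="median_house_value")
stats = compare_with_labels(target, [0.0, 0.0, 1.0, 1.0])
for name, series in stats.items():
    print(f"==={name}===")
    print(series)
```
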
covariance_type = full / init_params = random / k = 2 Done.
========== Compare with original labels ==========
===count===
predict
0.0     3783
1.0    16650
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    182900.0
1.0    179300.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    22500.0
1.0    14999.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    210182.047846
1.0    206655.962282
Name: median_house_value, dtype: float64
covariance_type = tied / init_params = kmeans / k = 2 Done.
========== Compare with original labels ==========
===count===
predict
0.0    20269
1.0      164
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    180600.0
1.0    119050.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    48300.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    207814.374562
1.0    144822.567073
Name: median_house_value, dtype: float64
covariance_type = tied / init_params = random / k = 2 Done.
========== Compare with original labels ==========
===count===
predict
0.0     8983
1.0    11450
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    170500.0
1.0    189100.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    14999.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    197207.962819
1.0    215233.303843
Name: median_house_value, dtype: float64
covariance_type = diag / init_params = kmeans / k = 2 Done.
========== Compare with original labels ==========
===count===
predict
0.0    15792
1.0     4641
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    178300.0
1.0    185400.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    14999.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    206122.450355
1.0    211345.555484
Name: median_house_value, dtype: float64
covariance_type = diag / init_params = random / k = 2 Done.
========== Compare with original labels ==========
===count===
predict
0.0     4632
1.0    15801
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    185400.0
1.0    178300.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    14999.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    211398.817573
1.0    206109.811784
Name: median_house_value, dtype: float64
covariance_type = full / init_params = kmeans / k = 3 Done.
========== Compare with original labels ==========
===count===
predict
0.0    10396
1.0     3109
2.0     6928
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
2.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    162700.0
1.0    177100.0
2.0    195400.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    22500.0
2.0    22500.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    192951.768276
1.0    205003.323898
2.0    229887.202945
Name: median_house_value, dtype: float64
covariance_type = full / init_params = random / k = 3 Done.
========== Compare with original labels ==========
===count===
predict
0.0    10937
1.0     2832
2.0     6664
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
2.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    167700.0
1.0    154300.0
2.0    204250.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    22500.0
2.0    14999.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    196782.568803
1.0    188129.555085
2.0    232735.084634
Name: median_house_value, dtype: float64
covariance_type = tied / init_params = kmeans / k = 3 Done.
========== Compare with original labels ==========
===count===
predict
0.0     8599
1.0      155
2.0    11679
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
2.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    175000.0
1.0    116300.0
2.0    183100.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    48300.0
2.0    14999.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    200490.849401
1.0    140995.490323
2.0    213208.780204
Name: median_house_value, dtype: float64
covariance_type = tied / init_params = random / k = 3 Done.
========== Compare with original labels ==========
===count===
predict
0.0    9284
1.0    5805
2.0    5344
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
2.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    192200.0
1.0    139700.0
2.0    193500.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    14999.0
2.0    22500.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    221758.488798
1.0    173236.156589
2.0    219217.582335
Name: median_house_value, dtype: float64
covariance_type = diag / init_params = kmeans / k = 3 Done.
========== Compare with original labels ==========
===count===
predict
0.0    7709
1.0    3386
2.0    9338
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
2.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    169300.0
1.0    181300.0
2.0    184400.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    22500.0
2.0    14999.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    198071.215333
1.0    207559.938866
2.0    214843.810987
Name: median_house_value, dtype: float64
covariance_type = diag / init_params = random / k = 3 Done.
========== Compare with original labels ==========
===count===
predict
0.0     7713
1.0    10109
2.0     2611
Name: median_house_value, dtype: int64
===max===
predict
0.0    500001.0
1.0    500001.0
2.0    500001.0
Name: median_house_value, dtype: float64
===median===
predict
0.0    182700.0
1.0    175800.0
2.0    187200.0
Name: median_house_value, dtype: float64
===min===
predict
0.0    14999.0
1.0    14999.0
2.0    22500.0
Name: median_house_value, dtype: float64
===mean===
predict
0.0    210227.068456
1.0    203538.031160
2.0    213287.293374
Name: median_house_value, dtype: float64
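
The twelve runs in this log cover every combination of `k` in `[2, 3]`, `covariance_type` in `full`, `tied`, `diag`, and `init_params` in `kmeans`, `random`, in that nesting order. A minimal sketch of such a sweep with scikit-learn's `GaussianMixture` follows. The function name `run_grid`, the fixed seed and the synthetic data are assumptions. Note that cluster ids are arbitrary, which is why the `kmeans` and `random` runs above can report near-identical groups under swapped labels.

```python
from itertools import product

import numpy as np
from sklearn.mixture import GaussianMixture


def run_grid(X, ks=(2, 3), covariance_types=("full", "tied", "diag"),
             init_params=("kmeans", "random"), seed=0):
    """Fit one GaussianMixture per parameter combination and collect the labels."""
    results = {}
    # product() varies the last argument fastest, matching the order of the log lines
    for k, cov, init in product(ks, covariance_types, init_params):
        model = GaussianMixture(n_components=k, covariance_type=cov,
                                init_params=init, random_state=seed)
        results[(cov, init, k)] = model.fit_predict(X)
        print(f"covariance_type = {cov} / init_params = {init} / k = {k} Done.")
    return results


# Synthetic stand-in data: two Gaussian clouds in three dimensions (not the housing data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 3)),
               rng.normal(6.0, 1.0, size=(100, 3))])
results = run_grid(X)
```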